Under consideration for publication in Theory and Practice of Logic Programming 






Enhancing the Expressive Power of the 
U-Datalog Language 



ELISA BERTINO 

University of Milano 
Via Comelico 39, 20135 Milano, Italy 
e-mail: bertino@dsi.unimi.it 

BARBARA CATANIA 

University of Genova 
Via Dodecaneso 35, 16146 Genova, Italy 
e-mail: catania@disi.unige.it 

ROBERTA GORI 

University of Pisa 
Corso Italia, 40 
56125 Pisa, Italy 
e-mail: gori@di.unipi.it 



Abstract 

U-Datalog has been developed with the aim of providing a set-oriented logical update 
language, guaranteeing update parallelism in the context of a Datalog-like language. In 
U-Datalog, updates are expressed by introducing constraints (+p(X), to denote insertion,
and -p(X), to denote deletion) inside Datalog rules. A U-Datalog program can be
interpreted as a CLP program. In this framework, a set of updates (constraints) is satisfiable if
it does not represent an inconsistent theory, that is, it does not require the insertion and
the deletion of the same fact. This approach resembles a very simple form of negation.
However, U-Datalog does not provide any mechanism to explicitly deal
with negative information, resulting in a language with limited expressive power. In this
paper, we provide a semantics, based on stratification, handling the use of negated atoms 
in U-Datalog programs and we show which problems arise in defining a compositional 
semantics. 



1 Introduction 

Deductive database technology represents an important step towards the goal of 
developing highly-declarative database programming languages. Several approaches 
for the inclusion of update capabilities in deductive languages have been pro- 
posed. In general, all those proposals are based on including in rules, in addi- 
tion to usual atoms, special atoms denoting updates. In most of those proposals, 
an update execution consists of a query component, identifying the data to be 
modified, and an update component, performing the actual modification on the 
selected data. A way to classify deductive update languages is with respect to 
the approach adopted for handling possible interferences between the query and
update components of the same update execution. In particular, updates can be
performed as soon as they are generated, as a side effect of the query evaluation,
thus applying an immediate semantics. Languages based on an immediate semantics
include LDL (Naqvi & Tsur, 1989), TL (Abiteboul & Vianu, 1991), DL
(Abiteboul & Vianu, 1991), DLP (Manchanda & Warren, 1988), and Statelog (Lausen et al., 1998).
The immediate semantics is in contrast with the deferred update semantics, by 
which updates are not applied as soon as they are generated during the query 
evaluation; rather, they are executed only when the query evaluation is completed. 
Languages based on a deferred semantics include U-Datalog (Bertino et al., 1998b),
Update Calculus (Chen, 1995; Chen, 1997), and ULTRA (Wichert & Freitag, 1997;
Wichert et al., 1998). Other languages, such as Transaction Logic (Bonner & Kifer, 1994),
provide both policies.

In this paper, we consider U-Datalog, a language based on a deferred semantics. 
Even if more expressive and flexible frameworks exist (see, for example, (Bonner & Kifer, 1994;
Wichert et al., 1998)), the choice of U-Datalog is motivated by the fact that it
represents an immediate extension of Datalog to deal with updates. This aspect makes
this language quite suitable for analyzing properties related to logical update
languages (Bertino & Catania, 1996). In U-Datalog, updates are expressed by
introducing constraints inside Datalog rules. For example, +p(a) states that in the new
state p(a) must be true, whereas -p(a) states that in the new state p(a) must be false.
Thus, U-Datalog programs are formally modeled as Constraint Logic Programming
(CLP) programs (Jaffar et al., 1998).

In CLP, any answer to a given goal (called a query, in the database context) 
contains a set of constraints, constraining the resulting solution. In U-Datalog, 
each solution contains a substitution for the query variables and a set of updates. 
The execution of a goal is based on a deferred semantics. In particular, given a 
query, all the solutions are generated in the so-called marking phase, using a CLP 
answering mechanism. All the updates, contained in the various solutions, are then 
executed in the update phase, by using an operational semantics. The set of all 
updates generated during the marking phase forms a constraint theory which can 
be inconsistent. From a logical point of view, this means that the update set contains 
constraints of the form +p(a), -p(a), requiring the insertion and the deletion of the
same fact. The U-Datalog computational model rejects any form of conflict, both 
locally, i.e., inside a single solution, and globally, from different solutions. Thus, the 
set of updates to be executed is always consistent. 

Besides the marking and update phases, it is often useful to devise
an additional semantics, known as compositional semantics (Bertino et al., 1998b).
This semantics, which is orthogonal with respect to the one defined above,
characterizes the semantics of the intensional database independently from the semantics
of the extensional one and is based on the notion of open programs (Bossi et al., 1994).
The compositional semantics is quite important in the context of deductive databases 
since it provides a theoretical framework for analyzing the properties of intensional 
databases. Indeed, it is always recursion free, even if it is not always finite.
Therefore, when it is finite, it also represents a useful pre-compilation technique for
intensional databases. However, since this semantics is usually expensive to compute,
it is mainly used for analysis purposes. 

Even if U-Datalog allows us to easily specify updates and transactions, its
expressive power is limited since no negation mechanism is provided, apart from the
limited form of negation on the extensional database that results from update
inconsistency. This kind of negation is obviously not sufficient to support a large variety of
user requests. In this paper we provide an operational mechanism handling negated
atoms in U-Datalog programs, providing a marking phase and a compositional
semantics. The proposed extension is based on the notion of stratification, first
proposed for logic programming and deductive databases (Abiteboul & Vianu, 1991;
Lausen et al., 1998; Manchanda & Warren, 1988; Naqvi & Tsur, 1989). This
extension is not, however, a straightforward extension of previously defined
stratification-based semantics for two main reasons. First of all, U-Datalog rules are not range
restricted (Ceri et al., 1990) but are required to be safe through query
invocation,¹ resulting in a non-ground semantics. Note that, even if this is a typical
assumption in a real context, most of the other deductive update languages
require range restricted rules or interpret free variables as generation of new values
(Abiteboul & Vianu, 1991). A second difference is that an atom may fail not only
because an answer substitution cannot be found but also because it generates an
inconsistent set of updates.

In the following, we first introduce U-Datalog in Section 2 and we extend it to
deal with negation in Section 3. Finally, in the last section we present some conclusions
and outline future work. Due to space limitations, we assume the reader to be
familiar with the basic notions of (constraint) logic programming (Jaffar et al., 1998;
Lloyd, 1987) and deductive databases (Ceri et al., 1990). For additional details on
U-Datalog, see (Bertino et al., 1998b).

2 U-Datalog 
2.1 Syntax

A U-Datalog database consists of: (i) an extensional database (or simply database) 
EDB, that is, a set of ground atoms (extensional atoms); (ii) an intensional database
IDB (or simply program), that is, a set of rules of the form:²

$$H \leftarrow b_1, \dots, b_k, u_1, \dots, u_s, B_1, \dots, B_t$$

where $H, B_1, \dots, B_t$ are atoms, $b_1, \dots, b_k$ are equality constraints, i.e. constraints of
type $X = t$ (denoted by $b$), where $X$ is a variable and $t$ is a term, and $u_1, \dots, u_s$ are
update constraints (denoted by $u$), also called update atoms. An update constraint
is an extensional atom preceded by the symbol +, to denote an insertion, or by the
symbol -, to denote a deletion.

In the following, the set of extensional predicates is denoted by $\Pi_{EDB}$, the set of
intensional predicates is denoted by $\Pi_{IDB}$, and the Herbrand Universe is denoted
by $\mathcal{H}$ (Lloyd, 1987). Moreover, a conjunction of equality and update constraints is
simply called constraint.³ As usual in deductive databases, $\Pi_{EDB}$ and $\Pi_{IDB}$ are
disjoint. Note that a U-Datalog program can be seen as a CLP program where
constraints are represented by equalities and update atoms.

1 See Section 2.1 for the formal definition of this property.

2 In the following, we assume that constants and multiple occurrences of the same variable inside
each atom are expressed by equality constraints between terms.

A U-Datalog transaction is a goal. In order to guarantee a finite answer to each 
goal and the generation of a set of ground updates, we assume that rules are safe 
through query invocation. This means that, given a U-Datalog database IDB and 
a goal G, each variable appearing in the head or in the update constraint of a rule, 
used in the evaluation of the goal, either appears in an atom contained in the body 
of the same rule, or is bound by a constant present in the goal. In this case, G is 
admissible for IDB. 

Example 1 

The following program is a U-Datalog intensional database: 

r1 : rem_man(X,Y) ← -dep_A(Y), emp_man(X,Y)

r2 : rem_man(X,Y) ← -dep_A(Y), emp_man(X,Z), rem_man(Z,Y)

r3 : ins_man(X) ← +dep_A(X), rem_man(X,Y)

An atom emp_man(a,b) is true if 'b' is a manager of 'a'. An atom rem_man(a,b)
is true if 'b' is a (possibly indirect) manager of 'a'. As a side effect, it requires the
removal of 'b' from department A. An atom ins_man(a) is true if 'a' has at least
one manager and, as a side effect, requires the insertion of 'a' in department A. At
the same time, it requires the deletion of all the (possibly indirect) managers of
'a' from department A. ◇



2.2 Semantics 

U-Datalog constraints are interpreted over the Herbrand Universe $\mathcal{H}$. In this
domain, equalities have the usual meaning; +p(X) is interpreted as the atom p(X) and
-p(X) is interpreted as the negated atom $\neg p(X)$. If a constraint $b \wedge u$ is $\mathcal{H}$-solvable,
i.e. if $\mathcal{H} \models b, u$, there exists at least one substitution that makes the constraint true.
Thus, for no atom both an insertion and a deletion are simultaneously required.
When this is not true, updates are said to be inconsistent. The execution of ground
inconsistent update atoms (e.g., +p(a), -p(a)) may lead to different extensional
databases, depending on the chosen execution order.

The generation of inconsistent updates is avoided as follows: (i) locally: a solution 
containing an inconsistent set of updates (i.e., an unsolvable set of constraints) is 
not included in the resulting set of solutions for the given goal; (ii) globally: if 
an inconsistency is generated due to two consistent solutions, the goal aborts, no 
update is executed, and the database is left in the state it had before the goal 
evaluation. 

The semantics of U-Datalog programs is given in two main steps. In the first
step, all solutions for a given goal are determined by applying a CLP evaluation
method (marking phase, see Section 2.2.2). Each solution contains a set of bindings
for the query variables and a set of consistent update atoms. In the second step
(update phase, see Section 2.2.3), the updates gathered in the various solutions are
executed only if they are consistent.

3 In the following, conjunction between constraints is represented by using ',' inside rule bodies,
and by using '∧' in other contexts.

Besides the marking and the update semantics, an additional semantics is
sometimes introduced in the database context, which is called compositional semantics
(see Section 2.2.1). Such semantics characterizes the intensional database
independently from the semantics of the extensional one. The compositional semantics is
quite important in the context of deductive databases since it provides a theoretical
framework for analyzing the properties of intensional databases.

2.2.1 Compositional semantics 

Since the extensional database is the only time-variant component of a U-Datalog 
database, for analysis purposes, it is useful to define the semantics of a U-Datalog 
intensional database independently from the current extensional database. Such 
semantics is called compositional semantics and is always represented by a recursion 
free set of rules. Therefore, when it is finite, or when an equivalent finite set of rules 
can be detected, it also represents a useful pre-compilation technique for intensional 
databases. 

The compositional semantics can be defined assuming the intensional database
to be an open program (Bossi et al., 1994), i.e., a program where the knowledge
regarding some predicates is assumed to be incomplete. Under this meaning, a
U-Datalog intensional database can be seen as a program that is open with respect
to the extensional predicates. The semantics of an open program is a set of rules,
whose bodies contain just open predicates. In order to define the compositional
semantics of a U-Datalog intensional database, we introduce the following set:

$$ID_{EDB} = \{\, p(\tilde{X}) \leftarrow p(\tilde{X}) \mid p \in \Pi_{EDB} \,\}.$$

In the previous expression, $\tilde{X}$ denotes a list of distinct variables. Similarly to
(Bertino et al., 1998b), we now introduce an unfolding operator. Such operator,
given programs $P$ and $Q$, replaces an atom $p(\tilde{X})$, appearing in the body of a rule
in $P$, with the body of a rule defining $p$ in $Q$.

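Definition 1

Let $P$ and $Q$ be U-Datalog programs. Then⁴

$$
\begin{aligned}
Unf_P(Q) = \{\, & p(\tilde{X}) \leftarrow b', u', \tilde{H}_1, \dots, \tilde{H}_n \mid \\
& \exists \text{ a renamed rule } p(\tilde{X}) \leftarrow b, u, p_1(\tilde{X}_1), \dots, p_n(\tilde{X}_n) \in P, \\
& \exists\, p_i(\tilde{Y}_i) \leftarrow b_i, u_i, \tilde{H}_i \in Q \ (i = 1, \dots, n), \text{ which share no variables}, \\
& b' = \bigwedge_i (b_i \wedge (\tilde{X}_i = \tilde{Y}_i)) \wedge b, \\
& u' = \bigwedge_i u_i \wedge u, \\
& b' \wedge u' \text{ is } \mathcal{H}\text{-solvable} \,\}
\end{aligned}
$$
□

4 In the following, we use the notation $\bigwedge_i c_i$ to represent the conjunction of constraints $c_1 \wedge \dots \wedge c_n$,
where $n$ is clear from the context. The symbol = denotes syntactic equality.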
The compositional semantics of a U-Datalog intensional database IDB is
obtained by repeatedly applying the unfolding operator until no new rules are
generated.

Definition 2

The compositional semantics $U_{IDB}$ of IDB with respect to $\Pi_{EDB}$ is defined as the
least fixpoint of $T^{c}_{IDB}(I) = Unf_{IDB}(I \cup ID_{EDB})$. □

The previous definition is based on the following result, taken from (Bertino et al., 1998b).

Theorem 1

$T^{c}_{IDB}$ is continuous. □

Theorem 2

For any extensional database EDB, for any admissible goal G, the evaluation of G
in $IDB \cup EDB$ generates the same answer constraints as the evaluation of G in
$U_{IDB} \cup EDB$. □

Note that $U_{IDB}$ is always recursion free. If IDB is a recursive program then $U_{IDB}$
in general is not finite. However, under specific assumptions, it is equivalent to a
finite set of rules (see Section 3.2).
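
As an illustration (a sketch worked out here from Definitions 1 and 2, not taken from the cited papers), consider rules r1 and r2 of Example 1. Assuming that equality constraints are simplified and duplicate update atoms are merged, iterating the unfolding operator yields one rule for each length of the management chain, where $Z_1, Z_2, \dots$ are fresh variables introduced by renaming:

$$
\begin{aligned}
rem\_man(X,Y) &\leftarrow -dep\_A(Y),\ emp\_man(X,Y) \\
rem\_man(X,Y) &\leftarrow -dep\_A(Y),\ emp\_man(X,Z_1),\ emp\_man(Z_1,Y) \\
rem\_man(X,Y) &\leftarrow -dep\_A(Y),\ emp\_man(X,Z_1),\ emp\_man(Z_1,Z_2),\ emp\_man(Z_2,Y) \\
&\ \ \vdots
\end{aligned}
$$

Every body contains extensional atoms only, so the resulting set is recursion free, but it is infinite because r2 is recursive.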



2.2.2 Marking phase 

The answers to a U-Datalog query can be computed in a top-down or in an
equivalent bottom-up style (Jaffar et al., 1998). Here, we introduce only the
bottom-up semantics. The Constrained Herbrand Base $\mathcal{B}$ for a U-Datalog program is
defined as the set of constrained atoms of the form $p(\tilde{X}) \leftarrow b_1, \dots, b_k, u_1, \dots, u_n$ where
$u_1, \dots, u_n$ are update atoms, $b_1, \dots, b_k$ are equality constraints, $p \in \Pi_{EDB} \cup \Pi_{IDB}$,
and $\tilde{X}$ is a tuple of distinct variables. An interpretation is any subset of the
Constrained Herbrand Base. Given a U-Datalog database $DB = IDB \cup EDB$,⁵ operator
$T_{DB} : 2^{\mathcal{B}} \rightarrow 2^{\mathcal{B}}$ is defined as follows:⁶

$$
\begin{aligned}
T_{DB}(I) = \{\, & p(\tilde{X}) \leftarrow b', u' \mid \\
& \exists \text{ a renamed rule } p(\tilde{X}) \leftarrow b, u, p_1(\tilde{Y}_1), \dots, p_n(\tilde{Y}_n) \in DB, \\
& \exists\, p_i(\tilde{X}_i) \leftarrow b_i, u_i \in I \ (i = 1, \dots, n), \text{ which share no variables}, \\
& b' = \bigwedge_i (b_i \wedge (\tilde{X}_i = \tilde{Y}_i)) \wedge b, \\
& u' = \bigwedge_i u_i \wedge u, \\
& b' \wedge u' \text{ is } \mathcal{H}\text{-solvable} \,\}.\,{}^{7}
\end{aligned}
$$

5 Even if ground atoms contained in the extensional database should be represented as constrained
atoms, we still write them as ground atoms to simplify the notation.

6 $2^{\mathcal{B}}$ is the set of all the subsets of the Constrained Herbrand Base $\mathcal{B}$.

7 We assume that all constraints generated by a fixpoint computation are projected onto the set
of the head and update atom variables. Moreover, we assume that a constrained atom is inserted
in the set being constructed only if it is not redundant.
Theorem 3

Let DB be a U-Datalog database. $T_{DB}$ is continuous and admits a unique least
fixpoint $\mathcal{FIX}_{DB}$, and $\mathcal{FIX}_{DB} = T_{DB} \uparrow \omega$. Such fixpoint represents the bottom-up
semantics of DB.⁸ □

Given $\mathcal{FIX}_{DB}$ and a goal $G = \leftarrow b, u, p_1(\tilde{X}_1), \dots, p_n(\tilde{X}_n)$, the solutions or answer
constraints for G are all constraints $\tilde{X}_1 = \tilde{Y}_1, \dots, \tilde{X}_n = \tilde{Y}_n, b', u'$ such that
$p_i(\tilde{Y}_i) \leftarrow b_i, u_i \in \mathcal{FIX}_{DB}$ $(i = 1, \dots, n)$, $u' = u \wedge u_1 \wedge \dots \wedge u_n$, $b' = b \wedge b_1 \wedge \dots \wedge b_n$, and
$u' \wedge b' \wedge \tilde{X}_1 = \tilde{Y}_1 \wedge \dots \wedge \tilde{X}_n = \tilde{Y}_n$ is $\mathcal{H}$-solvable. Let $b'' = \tilde{X}_1 = \tilde{Y}_1 \wedge \dots \wedge \tilde{X}_n = \tilde{Y}_n \wedge b'$.
In this case, we write $G, DB \longmapsto (b'', u'b'')$, where $u'b''$ denotes the result of the
application of the equalities specified in $b''$ to $u'$. We assume that $b''$ is restricted
to the variables of G. Note that $u'$ has to be consistent.

Example 2 

Consider $EDB_i$ = {emp_man(b,b), emp_man(b,c), dep_A(b), dep_A(c), dep_B(b)}
and the intensional database of Example 1. Transaction $T_1$ = ← ins_man(X), evaluated
in $EDB_i \cup IDB$, computes the consistent solution X = b, Y = c, -dep_A(Y),
+dep_A(X). The additional solution X = b, Y = b, -dep_A(Y), +dep_A(X) is not
consistent and therefore is discarded by the marking phase. ◇
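
The marking phase of this example can be replayed with a small Python sketch. It is only an illustration under simplifying assumptions: it evaluates rules r1, r2 and r3 directly on ground facts instead of manipulating constrained atoms, and the names `consistent`, `rem_man` and `ins_man` as well as the tuple encoding of facts and updates are choices made here, not part of U-Datalog.

```python
# Ground facts of EDB_i, encoded as (predicate, args) pairs.
EDB = {("emp_man", ("b", "b")), ("emp_man", ("b", "c")),
       ("dep_A", ("b",)), ("dep_A", ("c",)), ("dep_B", ("b",))}

def consistent(updates):
    """A set of ground updates is consistent if no fact is both inserted and deleted."""
    plus = {(p, a) for s, p, a in updates if s == "+"}
    minus = {(p, a) for s, p, a in updates if s == "-"}
    return plus.isdisjoint(minus)

def rem_man(edb):
    """Naive fixpoint for r1 and r2; returns triples (x, y, updates)."""
    emp = {a for p, a in edb if p == "emp_man"}
    derived = set()
    while True:
        new = set(derived)
        for x, y in emp:                                    # rule r1
            new.add((x, y, frozenset({("-", "dep_A", (y,))})))
        for x, z in emp:                                    # rule r2
            for z2, y, ups in derived:
                if z2 == z:
                    cand = ups | {("-", "dep_A", (y,))}
                    if consistent(cand):
                        new.add((x, y, frozenset(cand)))
        if new == derived:
            return derived
        derived = new

def ins_man(edb):
    """Rule r3; solutions whose update set is inconsistent are discarded (local check)."""
    solutions = []
    for x, y, ups in rem_man(edb):
        cand = ups | {("+", "dep_A", (x,))}
        if consistent(cand):
            solutions.append((x, y, cand))
    return solutions

solutions = ins_man(EDB)
```

On the database above, `solutions` contains the single answer with X = b and Y = c; the candidate with Y = b is dropped because it would both insert and delete dep_A(b).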



2.2.3 Update phase 

The update phase atomically executes the updates collected by the marking phase. 
Updates gathered by the different solutions for a given predicate are executed only 
if no inconsistency arises. This guarantees that only order independent executions 
are performed. Formally, let $u = \bigcup \{ u_j \mid G, DB \longmapsto (b_j, u_j) \}$. Let $EDB_i$ be the
current database state. If $u$ is a consistent and ground set of updates, the new
database $EDB_{i+1}$ is computed as follows: $EDB_{i+1} = (EDB_i \setminus \{ p(\tilde{t}) \mid -p(\tilde{t}) \in u \})
\cup \{ p(\tilde{t}') \mid +p(\tilde{t}') \in u \}$. In this case, we say that G commits, returning the tuple
$(\{ b_j \mid G, DB \longmapsto (b_j, u_j) \}, EDB_{i+1}, Commit)$. If $u$ is inconsistent or contains at
least one non-ground update atom,⁹ we let $EDB_{i+1} = EDB_i$ and say that G aborts.
In this case, the evaluation returns the tuple $(\{\}, EDB_i, Abort)$.

Example 3 

Consider $EDB_i$ as in Example 2 and the intensional database of Example 1. The
execution of transaction $T_1$ = ← ins_man(X) generates the new extensional database
$EDB_{i+1}$ = {emp_man(b,b), emp_man(b,c), dep_A(b), dep_B(b)}. ◇
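
The update phase itself is a simple set computation. The following Python sketch illustrates it under stated assumptions: the function names, the encoding of updates as (sign, predicate, args) triples, and the convention that variables start with an uppercase letter are all choices made for this illustration.

```python
def is_ground(args):
    # Assumption of this sketch: variables start with an uppercase letter.
    return not any(a[:1].isupper() for a in args)

def update_phase(edb, solutions):
    """solutions: list of (bindings, updates) pairs produced by the marking phase.
    Returns (answers, new_edb, status) as in the commit/abort tuples of the text."""
    updates = {u for _, ups in solutions for u in ups}
    plus = {(p, a) for s, p, a in updates if s == "+"}
    minus = {(p, a) for s, p, a in updates if s == "-"}
    ground = all(is_ground(a) for _, _, a in updates)
    if (plus & minus) or not ground:
        return [], set(edb), "Abort"        # database left unchanged
    new_edb = (set(edb) - minus) | plus     # deletions and insertions never overlap here
    return [b for b, _ in solutions], new_edb, "Commit"

EDB = {("emp_man", ("b", "b")), ("emp_man", ("b", "c")),
       ("dep_A", ("b",)), ("dep_A", ("c",)), ("dep_B", ("b",))}
marked = [({"X": "b"}, {("-", "dep_A", ("c",)), ("+", "dep_A", ("b",))})]
answers, new_edb, status = update_phase(EDB, marked)
```

With the solution computed in Example 2, the call commits and removes dep_A(c); two solutions that respectively insert and delete the same fact would instead make the goal abort.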



3 Introducing negation in U-Datalog 

Since solutions containing inconsistent updates are not returned by the marking
phase, the U-Datalog semantics models some kind of negation. This form of negation
is however very weak with respect to the ability to model arbitrary negation. Indeed,
it has been proved that, with respect to the returned substitutions, U-Datalog is
equivalent to Datalog extended with negation on extensional predicates and open
with respect to a subset of extensional predicates (Bertino & Catania, 2000). In
the following, in order to increase the expressive power of U-Datalog, we introduce
negated atoms in the bodies of U-Datalog rules. The resulting language is called
U-Datalog¬. Then, we assign a semantics to such language when the considered
programs are stratified. A stratified U-Datalog¬ program is defined as follows.

8 We recall that $T_P \uparrow 0 = \emptyset$, $T_P \uparrow i = T_P(T_P \uparrow (i-1))$, $T_P \uparrow \omega = \bigcup_{i < \omega} T_P \uparrow i$.

9 Note that a non-ground set of updates can only be generated by a non-admissible goal.

Definition 3 

A U-Datalog¬ program IDB is stratified if it is possible to find a sequence $P_1, \dots,
P_n$, $P_i \subseteq IDB$ $(i = 1, \dots, n)$ (also called stratification), such that the following
conditions hold (in the following, we denote with $Pred_i$ the set of predicates defined
in $P_i$):

1. $P_1, \dots, P_n$ is a partition of the rules of IDB. Each $P_i$ is called "stratum".

2. For each predicate $q \in Pred_j$, all the rules defining $q$ in IDB are in $P_j$.

3. If $q(u) \leftarrow \dots, q'(v), \dots \in IDB$, $q' \in Pred_j$, then $q \in Pred_k$ with $j \le k$.

4. If $q(u) \leftarrow \dots, \neg q'(v), \dots \in IDB$, $q' \in Pred_j$, then $q \in Pred_k$ with $j < k$. □

The previous definition can be extended to deal with a U-Datalog database DB = 
IDB U EDB. In this case, all extensional facts belong to the first level. 
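
The conditions of Definition 3 can be checked mechanically. The following Python sketch is an illustration and not part of the language definition: the function name `stratify`, the rule encoding, and the predicate `q` used in the usage lines are hypothetical. It ignores update atoms and equality constraints, assigns to each predicate the lowest admissible stratum number, and reports failure when a predicate depends negatively on itself.

```python
def stratify(rules, edb_preds=()):
    """rules: list of (head_pred, [(body_pred, is_negated), ...]).
    Returns {pred: stratum} with strata numbered from 1, or None if the
    program is not stratified."""
    preds = ({h for h, _ in rules}
             | {q for _, body in rules for q, _ in body}
             | set(edb_preds))
    stratum = {p: 1 for p in preds}
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            for q, negated in body:
                # positive dependency: j <= k; negative dependency: j < k
                need = stratum[q] + 1 if negated else stratum[q]
                if stratum[head] < need:
                    if need > len(preds):   # only possible with a cycle through negation
                        return None
                    stratum[head] = need
                    changed = True
    return stratum

rules = [
    ("rem_man", [("emp_man", False)]),                      # r1
    ("rem_man", [("emp_man", False), ("rem_man", False)]),  # r2
    ("ins_man", [("rem_man", False)]),                      # r3
    ("q", [("dep_B", False), ("ins_man", True)]),           # hypothetical rule with negation
]
strata = stratify(rules, edb_preds=["dep_A"])
```

Here `q` ends up one stratum above `ins_man`, while the recursive predicate `rem_man` stays in the same stratum as `emp_man`, which condition 3 allows. A stratification need not be minimal, so coarser or finer sequences satisfying the four conditions are equally valid.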

In order to assign a semantics to stratified U-Datalog¬ programs, we assume that
each rule in the program is safe through query invocation. Due to the introduction of 
negation, the notion of safety is extended by requiring that each variable appearing 
in a rule head, in a negated literal contained in a rule body, or in an update atom 
also appears in a positive literal in the rule body or is bound by a constant present 
in the goal. 

The main differences between the bottom-up semantics we are going to present
and the bottom-up semantics defined for Stratified Datalog¬ programs (Ceri et al., 1990;
Chandra & Harel, 1985) are the following. Due to the condition of safety through
query invocation, the semantics of a U-Datalog¬ program may contain non-ground
constrained atoms that, however, will be made ground by the goal. Thus, negated
atoms cannot be used, as usually done, as conditions to be satisfied by a solution.
Indeed, some variables inside the generated solutions may be made ground by the
goal. A solution to this problem is to explicitly represent, during the bottom-up
computation, the solutions for which a negated atom $\neg B$ is true. In this way, we
maintain all the conditions that the solutions have to satisfy but the check will be
executed only when a match with a query goal is performed. To represent such
solutions, the underlying constraint theory must be extended to deal with inequality
constraints of type $X \neq a$, where $X$ is a variable and $a$ is a constant. For example,
if $X = a$ is the only solution for $p(X)$, then $X \neq a$ is the solution for $\neg p(X)$.¹⁰



10 Note that, due to the Closed World Assumption (Ceri et al., 1990), this is only a difference at
the presentation level that allows us to treat positive and negative
literals in a homogeneous way during the bottom-up computation.
A second aspect is related to the semantics of $\neg B$ with respect to the updates
collected by $B$. $B$, in fact, can also fail due to the generation of inconsistent
updates. Thus, all solutions containing inconsistent updates represent solutions for
$\neg B$. Solutions for $\neg B$ in a database $IDB \cup EDB$ are therefore obtained by
evaluating $B$ in $IDB \cup EDB$, and complementing not only the computed constraints
but also the constraints which ensure the consistency of the updates generated by
evaluating $B$.

Finally, we assume that the derivation of $\neg B$ does not generate any update. This
assumption is motivated by the fact that the evaluation of $\neg B$ should be considered
as a test with respect to the bindings generated by positive atoms.

In the following, we present the marking phase and the compositional semantics
for Stratified U-Datalog¬ programs. Note that no modification to the update phase
is required. Proofs of the presented results can be found in (Bertino et al., 1999).



3.1 The marking phase semantics 

As a natural extension of the constraints domain presented in (Bertino et al., 1998b),
the Constrained Herbrand Base for U-Datalog¬ (denoted by $\mathcal{B}^{\neg}$) consists of
constrained literals of the form $L \leftarrow b, u$, where $b$ is a conjunction of equality and
inequality constraints, $u$ is a conjunction of update atoms, and $L$ is a literal. If $L$ is
a negated atom, $u$ is empty. In the following, the set of all conjunctions of equality
and inequality constraints, constructed on the Herbrand Universe $\mathcal{H}$, is denoted
by $\mathcal{C}$.

Definition 4 

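Let $DB = IDB \cup EDB$ be a U-Datalog¬ database. The bottom-up operator
$T^{\neg}_{DB} : 2^{\mathcal{B}^{\neg}} \rightarrow 2^{\mathcal{B}^{\neg}}$ is defined as follows:

$$
\begin{aligned}
T^{\neg}_{DB}(I) = \{\, & p(\tilde{X}) \leftarrow b', u' \mid \\
& \exists \text{ a renamed rule } p(\tilde{X}) \leftarrow b, u, L_1(\tilde{Y}_1), \dots, L_n(\tilde{Y}_n) \in DB, \\
& \exists\, L_i(\tilde{X}_i) \leftarrow b_i, u_i \in I \ (i = 1, \dots, n), \text{ which share no variables}, \\
& b' = \bigwedge_i (b_i \wedge (\tilde{X}_i = \tilde{Y}_i)) \wedge b, \\
& u' = \bigwedge_i u_i \wedge u, \\
& b' \wedge u' \text{ is } \mathcal{H}\text{-solvable} \,\}.
\end{aligned}
$$
□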

Before introducing the fixpoint semantics, we define an operator Neg which per- 
forms the negation of a constraint belonging to C. 

Definition 5 

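Let $c = c_1 \wedge \dots \wedge c_n$. $Neg : \mathcal{C} \rightarrow 2^{\mathcal{C}}$ is defined as follows:

$$
Neg(c) =
\begin{cases}
\{c'_1, \dots, c'_m\} & \text{if } c'_1 \vee \dots \vee c'_m \text{ is equivalent to } \neg c_1 \vee \dots \vee \neg c_n, \\
& c'_i \text{ is } \mathcal{H}\text{-solvable } (i = 1, \dots, m), \\
& \text{and } \forall j,\ j = 1, \dots, m,\ c'_1 \vee \dots \vee c'_m \text{ is not equivalent to} \\
& c'_1 \vee \dots \vee c'_{j-1} \vee c'_{j+1} \vee \dots \vee c'_m \\
\{false\} & \text{otherwise}
\end{cases}
$$
□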
For example, if $c = (X = 2 \wedge Y = 3)$, then $Neg(c) = \{X \neq 2, Y \neq 3\}$. Operator $Neg$
is used to define an additional operator $Comp$, which takes a set $S$ of constrained
positive literals and returns the set of constrained negative literals belonging
to the complement of $S$. This operator is used to make explicit the constraints
for negative literals at the end of the computation of the positive literals of each
stratum.

Definition 6 

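$Comp : 2^{\mathcal{B}^{\neg}} \rightarrow 2^{\mathcal{B}^{\neg}}$ is defined as follows:

• $\neg p(\tilde{X}) \leftarrow{}$ (with an empty constraint) belongs to $Comp(S)$ if there does not exist any $p(\tilde{Y}) \leftarrow b, u \in S$.

• $\neg p(\tilde{X}) \leftarrow b' \in Comp(S)$ iff $p(\tilde{X}) \leftarrow b_1, u_1, \dots, p(\tilde{X}) \leftarrow b_n, u_n$ are the
only (renamed apart) constrained atoms for $p$ in $S$, $b'_i \in Neg((b_i \wedge b_{u_i})|_{\tilde{X}})$,¹¹
$b_{u_i} \in Sol(u_i b_i)$¹² $(i = 1, \dots, n)$, $b' = \bigwedge_i b'_i$, and $b'$ is $\mathcal{H}$-solvable. □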

In the previous definition, $Sol(u)$ is the set of minimal¹³ constraints which imply
that $u$ is a consistent set of updates. For example, $Sol(+p(a,Y), -p(X,Z),
-p(X,b), -p(b,c)) = \{X \neq a,\ Y \neq Z \wedge Y \neq b\}$. Of course, if $u$ is an inconsistent
set of update atoms, no solution is generated and $Sol(u) = false$. We also assume
that all redundant constrained literals contained in $Comp(S)$ are removed.
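
The example above can be reproduced with a short Python sketch. It is a purely syntactic approximation written for this illustration, not the definition of $Sol$: the names `sol`, `conflict_clause`, `neq` and `is_var` are assumptions, variables are assumed to start with an uppercase letter, minimality is approximated by set inclusion between conjunctions, and no further solvability reasoning is performed. An inequality $s \neq t$ is encoded as the pair `(s, t)`.

```python
from itertools import product

def is_var(term):
    return term[:1].isupper()        # assumption: variables start with an uppercase letter

def neq(s, t):
    """Normalised encoding of the inequality s != t as a pair."""
    if is_var(s) and is_var(t):
        return tuple(sorted((s, t)))
    return (s, t) if is_var(s) else (t, s)

def conflict_clause(ins_args, del_args):
    """Inequalities of which at least one must hold for the two atoms to denote
    different facts; None if they can never denote the same fact."""
    clause = set()
    for s, t in zip(ins_args, del_args):
        if s == t:
            continue                 # identical terms cannot tell the atoms apart
        if not is_var(s) and not is_var(t):
            return None              # two distinct constants: never the same fact
        clause.add(neq(s, t))
    return clause                    # empty: the atoms always denote the same fact

def sol(updates):
    """updates: list of (sign, pred, args). Returns False if the set is inconsistent,
    otherwise a set of conjunctions (frozensets of inequalities)."""
    ins = [(p, a) for sign, p, a in updates if sign == "+"]
    dels = [(p, a) for sign, p, a in updates if sign == "-"]
    clauses = []
    for (p, a), (q, b) in product(ins, dels):
        if p != q or len(a) != len(b):
            continue
        clause = conflict_clause(a, b)
        if clause is None:
            continue
        if not clause:
            return False
        clauses.append(sorted(clause))
    # pick one inequality per clause, then drop conjunctions that contain a smaller one
    candidates = {frozenset(choice) for choice in product(*clauses)}
    return {c for c in candidates if not any(d < c for d in candidates)}

result = sol([("+", "p", ("a", "Y")), ("-", "p", ("X", "Z")),
              ("-", "p", ("X", "b")), ("-", "p", ("b", "c"))])
```

For the update set of the example, `result` contains the conjunction with the single inequality X ≠ a and the conjunction of Y ≠ Z and Y ≠ b. An empty conjunction in the result means that the updates are consistent without any further condition.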

In the previous definition, operator Neg is applied to computed constraints and to 
the constraints which make satisfiable the updates generated by the corresponding 
positive atom. Such solutions have been restricted to the head variables since all 
the other variables are not needed to define the solutions for the negated atom. 

The fixpoint of $T^{\neg}_{DB}$ is computed as follows. First, the fixpoint of a given stratum
is computed. Then, all the facts that have not been derived are made explicitly
false. This corresponds to locally applying the CWA. Note that, due to stratification,
this approach is correct since each predicate is completely defined in one stratum.

Definition 7 

Let $DB = IDB \cup EDB$ be a stratified U-Datalog¬ database. Let $(P_i)_{1 \le i \le n}$ be a
stratification for DB. The bottom-up semantics of DB is defined as $\mathcal{FIX}^{\neg}_{DB} = M_n$,
where the sequence $M_1, \dots, M_n$ is computed as follows:

$$
\begin{aligned}
M_1 &= T^{\neg}_{P_1} \uparrow \omega \ \cup\ Comp(T^{\neg}_{P_1} \uparrow \omega) \\
M_{i+1} &= T^{\neg}_{P_{i+1} \cup M_i} \uparrow \omega \ \cup\ Comp(T^{\neg}_{P_{i+1} \cup M_i} \uparrow \omega), \quad 1 \le i < n.
\end{aligned}
$$
□

Theorem 4

Let DB be a stratified U-Datalog¬ database. $\mathcal{FIX}^{\neg}_{DB}$ can be computed in a finite
number of steps. □

Due to some basic results presented in (Chandra & Harel, 1985), the bottom-up
semantics of a stratified U-Datalog¬ database is independent from the chosen
stratification.

The answers to a given U-Datalog¬ goal are computed as described for U-Datalog
programs in Section 2.2.2 by replacing $\mathcal{FIX}_{DB}$ with $\mathcal{FIX}^{\neg}_{DB}$.

11 $c|_{\tilde{X}}$ denotes the projection of constraint $c$ onto the variables in $\tilde{X}$ (thus, all the other variables
are eliminated by applying a variable elimination algorithm (Chang & Keisler, 1973)).

12 See note 9.

13 Minimality is defined with respect to the order $\prec$ defined as follows: $c \prec c'$ if $\mathcal{H} \models c' \rightarrow c$.

Example 4 

Consider the extensional database EDB of Example 2 and the U-Datalog¬ program
IDB obtained by adding the following rules to the ones presented in Example 1:

r4 : change_man(X) ← -emp_man(X,Y), dep_B(X), dep_A(Y)

r5 : change_man(X) ← X = Y, +emp_man(X,Y), dep_B(X), ¬ins_man(X)

An atom change_man(a) is now true if 'a' belongs to department B and if there
exists at least one employee in department A. In this case, it removes all managers
of 'a' belonging to department A. It is also true if 'a' belongs to department B and
it has no manager. In this case, the evaluation removes all managers of 'a' belonging
to department A and makes 'a' manager of itself.

A possible stratification for $DB = IDB \cup EDB$ is the following: $P_1 = EDB \cup
\{r1, r2\}$, $P_2 = \{r3\}$, $P_3 = \{r4, r5\}$. $\mathcal{FIX}^{\neg}_{DB}$ is computed as follows:

$T^{\neg}_{P_1} \uparrow \omega$ = { emp_man(X,Y) ← X = b, Y = b;
    emp_man(X,Y) ← X = b, Y = c;
    dep_A(X) ← X = b; dep_A(X) ← X = c; dep_B(X) ← X = b;
    rem_man(X,Y) ← X = b, Y = b, -dep_A(Y);
    rem_man(X,Y) ← X = b, Y = c, -dep_A(Y) }

$Comp(T^{\neg}_{P_1} \uparrow \omega)$ = { ¬emp_man(X,Y) ← X ≠ b;
    ¬emp_man(X,Y) ← Y ≠ b, Y ≠ c;
    ¬dep_A(X) ← X ≠ b, X ≠ c; ¬dep_B(X) ← X ≠ b;
    ¬rem_man(X,Y) ← X ≠ b;
    ¬rem_man(X,Y) ← Y ≠ b, Y ≠ c }

$M_1 = T^{\neg}_{P_1} \uparrow \omega \cup Comp(T^{\neg}_{P_1} \uparrow \omega)$

$T^{\neg}_{P_2 \cup M_1} \uparrow \omega$ = { ins_man(X) ← X = b, Y = c, -dep_A(Y), +dep_A(X) } ∪ $M_1$

$Comp(T^{\neg}_{P_2 \cup M_1} \uparrow \omega)$ = { ¬ins_man(X) ← X ≠ b }

$M_2 = T^{\neg}_{P_2 \cup M_1} \uparrow \omega \cup Comp(T^{\neg}_{P_2 \cup M_1} \uparrow \omega)$

$T^{\neg}_{P_3 \cup M_2} \uparrow \omega$ = { change_man(X) ← X = b, Y = c, -emp_man(X,Y);
    change_man(X) ← X = b, Y = b, -emp_man(X,Y) } ∪ $M_2$

$Comp(T^{\neg}_{P_3 \cup M_2} \uparrow \omega)$ = { ¬change_man(X) ← X ≠ b }

$\mathcal{FIX}^{\neg}_{DB} = M_3 = T^{\neg}_{P_3 \cup M_2} \uparrow \omega \cup Comp(T^{\neg}_{P_3 \cup M_2} \uparrow \omega)$.

Note that rule r5 does not provide any additional answer for predicate
change_man. Indeed, ¬ins_man generates the constraint X ≠ b and dep_B
generates the constraint X = b. Thus, the whole constraint is inconsistent. ◇
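
The complement constraints listed in this example can be recomputed with a small Python sketch. It covers only the simple case met here, namely constrained atoms whose constraints are equalities between a variable and a constant and whose updates are always consistent, so that $Sol$ contributes nothing. The names `neg` and `comp` and the dictionary encoding are assumptions of this illustration, and redundancy is removed by plain set inclusion.

```python
from itertools import product

def neg(conj):
    """Negation of a conjunction of equalities {var: const}:
    one inequality (var, const) per conjunct."""
    return [(v, c) for v, c in sorted(conj.items())]

def comp(solutions):
    """solutions: list of {var: const} dicts, one per constrained atom derived
    for a predicate. Returns the conjunctions of inequalities (as frozensets)
    under which the negated literal holds."""
    if not solutions:
        return {frozenset()}         # nothing derived: the negation holds unconditionally
    candidates = {frozenset(choice)
                  for choice in product(*(neg(s) for s in solutions))}
    # keep a conjunction only if no strictly smaller candidate exists
    return {c for c in candidates if not any(d < c for d in candidates)}

not_emp_man = comp([{"X": "b", "Y": "b"}, {"X": "b", "Y": "c"}])
not_dep_A = comp([{"X": "b"}, {"X": "c"}])
```

`not_emp_man` contains the two conjunctions X ≠ b and Y ≠ b ∧ Y ≠ c, and `not_dep_A` contains the single conjunction X ≠ b ∧ X ≠ c, matching the corresponding entries of the complement of the first stratum.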
3.2 Compositional semantics

The compositional semantics for U-Datalog programs (Section 2.2.1) was defined
by using an unfolding operator which replaces the atom $p(\tilde{X})$ in the body of a rule
with the body of a rule defining $p$. Problems arise when unfolding negated atoms.
Suppose we want to unfold $\neg p(\tilde{X})$; then the disjunction of the bodies of all the rules
that define predicate $p$ in the compositional semantics has to be negated. Suppose
the following rules represent the compositional semantics of a predicate $p$:

$$
\begin{aligned}
p(\tilde{X}) &\leftarrow b_1, u_1, L_1 \\
&\ \ \vdots \\
p(\tilde{X}) &\leftarrow b_n, u_n, L_n.
\end{aligned}
$$

Since $p(\tilde{X})$ is true (due to the CWA) if and only if $b_1, u_1, L_1 \vee \dots \vee b_n, u_n, L_n$
is true, $\neg p(\tilde{X})$ has to be unfolded with the negation of $b_1, u_1, L_1 \vee \dots \vee b_n, u_n, L_n$
(since $\neg p(\tilde{X}) \leftrightarrow \neg(b_1, u_1, L_1 \vee \dots \vee b_n, u_n, L_n)$).¹⁴
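
As a worked illustration (a sketch under the simplifying assumption that no update atoms occur, with hypothetical predicates $h$, $s$, $q$ and $r$ that do not belong to the running example), suppose that the compositional semantics of $p$ consists of the two rules $p(X) \leftarrow X = a, q(X)$ and $p(X) \leftarrow r(X)$. Then

$$
\neg p(X) \leftrightarrow \neg\big((X = a \wedge q(X)) \vee r(X)\big) \leftrightarrow (X \neq a \wedge \neg r(X)) \vee (\neg q(X) \wedge \neg r(X)),
$$

so that unfolding $\neg p(X)$ in a rule $h(X) \leftarrow s(X), \neg p(X)$ would be expected to produce one rule per disjunct:

$$
\begin{aligned}
h(X) &\leftarrow s(X),\ X \neq a,\ \neg r(X) \\
h(X) &\leftarrow s(X),\ \neg q(X),\ \neg r(X).
\end{aligned}
$$

When update atoms are present, the constraints ensuring their consistency have to be complemented as well, as discussed for the marking phase.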

A problem arises when there exists an infinite set of rules defining p(X) in the 
compositional semantics. In this case, the unfolding operator cannot be applied since 
it is not effective. In order to solve this problem, a weaker notion of compositionality 
can be introduced, based on the restriction of the set of extensional databases 
with respect to which the intensional database can be composed. The additional 
information available on the considered extensional database has to guarantee that 
the result of the unfolding operator, which unfolds all the positive literals in the 
rule bodies, is finite. Gabbrielli et al. (1993) showed that, when
the Herbrand Universe $\mathcal{H}$ is finite, it is possible to compute a T-stable semantics of
a logic program IDB, which is finite and gives the same answer constraints as IDB
when composed with any extensional database defined on $\mathcal{H}$. Intuitively, the
T-stable semantics iterates the unfolding operator as long as the newly unfolded
rules may give different results on the finite domain $\mathcal{H}$.$^{15}$

Under these hypotheses, the compositional semantics for a stratified U-Datalog$^\neg$
program corresponds to an unfolding semantics computed in two steps:

1. In the first step, all positive literals in the rule bodies are unfolded, by
computing the T-stable semantics according to the algorithm given in
(Gabbrielli et al., 1993) and the unfolding operator presented in Section 2.2.1.
At this stage, negative literals are left unchanged.

2. In the second step, negative literals are unfolded. Due to the finite domain
assumption and the results presented in (Gabbrielli et al., 1993), the set of rules
required to unfold negative literals is finite.



14 The way we unfold a negated atom $\neg p(X)$ corresponds to the syntactic transformation
performed in Clark's completion approach (Clark, 1987). However, while Clark's completion
is used as a logical theory, our resulting unfolded program is evaluated by using a bottom-up
stratified semantics. Therefore, we can prove that P and its unfolded version are equivalent
w.r.t. answer constraints (see Theorem 7). Moreover, since we deal with stratified programs, no
$L_i$ in the formula $f = p(X) \leftrightarrow (b_1, u_1, L_1 \vee \dots \vee b_n, u_n, L_n)$ can be
equal to $\neg p(X)$, since no cycle through negation arises in predicate definitions. Thus,
$f$ is always consistent (Clark, 1987).
15 Note that the finite domain assumption can be guaranteed only by executing updates which do
not insert new values inside the database.

The result is a recursion-free program written in an extended U-Datalog$^\neg$ language
which characterizes the semantics of the intensional database w.r.t. the extensional
one.

3.2.1 Unfolding of positive literals 

In order to compute the compositional semantics of a program IDB, we first unfold
positive literals by dealing with negative literals as if they were extensional
predicates. This means that negative literals are not unfolded. To this purpose, the
techniques presented in (Gabbrielli et al., 1993) are applied to obtain a finite set of
rules, denoted by $U^{pos}_{IDB}$. The basic idea of the T-stable semantics is illustrated by
the following example.

Example 5 

Consider rules r1 and r2 of the program presented earlier, and suppose that
$\mathcal{H} = \{a, b\}$. After two iterations of the unfolding operator presented in
Section 2.2.1 we obtain the following rules (call them T):

$rem\_man(X, Y) \leftarrow dep\_A(Y), emp\_man(X, Y)$

$rem\_man(X, Y) \leftarrow dep\_A(Y), emp\_man(X, Z), emp\_man(Z, Y).$

At the third iteration of the unfolding operator we also obtain the rule

$rem\_man(X, Y) \leftarrow dep\_A(Y), emp\_man(X, Z), emp\_man(Z, W), emp\_man(W, Y)$

which cannot infer different results on any database defined on two elements, since
$rem\_man(X, Y)$ computes the transitive closure of relation $emp\_man$. Thus, T
corresponds to the T-stable semantics of the previous rules. Now suppose that
$\mathcal{H} = \{a, b, c\}$. The T-stable semantics $U^{pos}_{IDB}$ is computed as follows (in the
following, $U^{pos}(P_j)$ denotes the set of rules contained in stratum $j$ of $U^{pos}_{IDB}$).

$U^{pos}_{IDB} =$

$U^{pos}(P_1)$:

$rem\_man(X, Y) \leftarrow dep\_A(Y), emp\_man(X, Y)$

$rem\_man(X, Y) \leftarrow dep\_A(Y), emp\_man(X, Z), emp\_man(Z, Y)$

$rem\_man(X, Y) \leftarrow dep\_A(Y), emp\_man(X, Z), emp\_man(Z, W), emp\_man(W, Y)$

$U^{pos}(P_2)$:

$ins\_man(X) \leftarrow +dep\_A(X), -dep\_A(Y), emp\_man(X, Y)$

$ins\_man(X) \leftarrow +dep\_A(X), -dep\_A(Y), emp\_man(X, Z), emp\_man(Z, Y)$

$ins\_man(X) \leftarrow +dep\_A(X), -dep\_A(Y), emp\_man(X, Z), emp\_man(Z, W), emp\_man(W, Y)$

$U^{pos}(P_3)$:

$change\_man(X) \leftarrow emp\_man(X, Y), dep\_B(X), dep\_A(Y)$

$change\_man(X) \leftarrow X = Y, +emp\_man(X, Y), dep\_B(X), \neg ins\_man(X).$ ◇

By results presented in (Gabbrielli et al., 1993) and (Maher, 1993), we can state
the following results.
Theorem 5

Let $\mathcal{H}$ be the fixed and finite Herbrand Universe. Let $U^{pos}_{IDB}$ be the T-stable
semantics computed as described in (Gabbrielli et al., 1993), depending on the cardinality
of $\mathcal{H}$. For any extensional database EDB, $U^{pos}_{IDB} \cup EDB$ is equivalent to
$IDB \cup EDB$. Moreover, $U^{pos}_{IDB}$ admits the same stratification as IDB and preserves
goal admissibility. □
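
The dependence on the cardinality of $\mathcal{H}$ can be checked by brute force for the transitive-closure rules of Example 5. The following Python sketch is only an illustration under stated assumptions: it is not the algorithm of (Gabbrielli et al., 1993), it ignores the filter $dep\_A(Y)$ (which is the same in every unfolded rule), and the helper names `compose`, `chains_up_to` and `stable_depth` are hypothetical. It enumerates every binary relation over a small domain and looks for the smallest number of $emp\_man$ atoms after which a longer chain can no longer add an answer.

```python
from itertools import product


def compose(r, s):
    """Relational composition: (x, z) such that (x, y) in r and (y, z) in s."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def chains_up_to(emp, k):
    """Answers of the unfolded rules whose bodies hold 1..k emp_man atoms."""
    answers, chain = set(), set(emp)
    for _ in range(k):
        answers |= chain
        chain = compose(chain, emp)
    return answers


def stable_depth(domain):
    """Smallest k such that chains of length k + 1 add no answer on any
    binary relation over `domain` (exhaustive check, tiny domains only)."""
    pairs = list(product(domain, repeat=2))
    relations = [
        {p for p, keep in zip(pairs, bits) if keep}
        for bits in product([False, True], repeat=len(pairs))
    ]
    k = 1
    while any(chains_up_to(r, k) != chains_up_to(r, k + 1) for r in relations):
        k += 1
    return k


print(stable_depth(["a", "b"]))       # 2
print(stable_depth(["a", "b", "c"]))  # 3
```

The two values agree with Example 5: two unfolded rules suffice on a two-element universe, while a three-element universe also needs the rule with three $emp\_man$ atoms.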



3.2.2 Unfolding of negative literals 

After constructing $U^{pos}_{IDB}$, negative literals have to be unfolded. In order to unfold a
negated literal $\neg p(X)$, the disjunction of the bodies of all the rules defining predicate
$p$ in $U^{pos}_{IDB}$ has to be negated. This approach should be applied stratum by stratum,
generating in a finite number of steps a set of rules not containing negative literals.
Note that, due to the stratification conditions, the unfolding of $\neg p$ is required only in
rules belonging to levels higher than the level where $p$ is defined. The resulting set of
rules corresponds to the compositional semantics of IDB. Unfortunately,
the negated disjunction of the bodies defining $p$ is, in general, a first order formula,
which cannot be represented in U-Datalog$^\neg$, as the following example shows.

Example 6 

Consider the intensional predicate $p$ defined by the rule
$r : p(X) \leftarrow X = a, f(X, Y), q(X, Y)$, where $f$ and $q$ are extensional predicates.
The previous rule is logically equivalent to the following first order formula:
$p(X) \leftarrow X = a \wedge \exists Y\,(f(X, Y) \wedge q(X, Y))$. By assuming that $r$ is the
only rule defining $p$, by CWA, we obtain that
$\neg p(X) \leftrightarrow \neg(X = a \wedge \exists Y\,(f(X, Y) \wedge q(X, Y)))$. But
$\neg(X = a \wedge \exists Y\,(f(X, Y) \wedge q(X, Y)))$ is logically equivalent to
$(X \neq a) \vee (\forall Y\,(\neg f(X, Y) \vee \neg q(X, Y)))$, which can
always be transformed into prenex disjunctive normal form (Maher, 1988), obtaining
$\forall Y\,(X \neq a \vee \neg f(X, Y) \vee \neg q(X, Y))$. ◇
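
The equivalence used in Example 6 can be confirmed exhaustively on a small universe. The sketch below is illustrative only: the two-element `DOMAIN` and the function names are assumptions made for the check, and $f$ and $q$ range over all binary relations on that domain.

```python
from itertools import product

DOMAIN = ["a", "b"]  # assumed finite Herbrand universe, chosen for the check
PAIRS = list(product(DOMAIN, repeat=2))


def all_relations():
    """Enumerate every binary relation over DOMAIN."""
    for bits in product([False, True], repeat=len(PAIRS)):
        yield {p for p, keep in zip(PAIRS, bits) if keep}


def p_holds(x, f, q):
    """Body of rule r: X = a, f(X, Y), q(X, Y), with Y existentially quantified."""
    return x == "a" and any((x, y) in f and (x, y) in q for y in DOMAIN)


def negated_body(x, f, q):
    """Prenex form: forall Y (X != a or not f(X, Y) or not q(X, Y))."""
    return all(x != "a" or (x, y) not in f or (x, y) not in q for y in DOMAIN)


def equivalence_holds():
    """True if not p(X) and the prenex formula agree on every f, q and X."""
    return all(
        (not p_holds(x, f, q)) == negated_body(x, f, q)
        for f in all_relations()
        for q in all_relations()
        for x in DOMAIN
    )


print(equivalence_holds())  # True
```

The local variable $Y$, existential in the rule body, is the one that ends up universally quantified after negation.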

From the previous example it follows that, in order to unfold negative literals, the
U-Datalog$^\neg$ syntax has to be extended to deal with first order formulas. As shown
in the example, the variables which become quantified after negation correspond to
local variables of the original rule, i.e., body variables not appearing in the rule head.
After this extension, the syntax of a U-Datalog$^\neg$ rule, hereafter called extended
U-Datalog$^\neg$ rule, becomes the following:

$H \leftarrow b, u, L \diamond Q(b_1, H_1 \vee \dots \vee b_n, H_n)$

where $H$ is an atom, $u$ is an update constraint, $b$ is a conjunction of equality
constraints, $L$ is a conjunction of positive literals, and
$Q(b_1, H_1 \vee \dots \vee b_n, H_n)$ is a first order formula in prenex disjunctive normal
form, where $Q$ is a sequence of quantified variables not appearing in $H$ or in $L$, each
$b_i$ is a conjunction of equality and inequality constraints, and each $H_i$ is a conjunction
of literals.$^{16}$ Intuitively, $b, u, L$ is generated by the unfolding of positive literals
whereas the first order formula $Q(b_1, H_1 \vee \dots \vee b_n, H_n)$ is generated by the
unfolding of negative literals. Of course, we still assume that rules are stratified.

16 Note that the proposed extensions are performed at the rule body level. No change to the
Herbrand Base is performed.

An extended U-Datalog$^\neg$ rule body is true in a given interpretation if there exist
some bindings for the positive literals and some bindings for the free variables of
the quantified formula which make the rule body true in the given interpretation.
Formally, the truth of an extended U-Datalog$^\neg$ rule body can be defined as follows.

Definition 8 

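Let $R = b, u, L_1(Y_1), \dots, L_m(Y_m) \diamond F$ where
$F = Q(b_1, H_1 \vee \dots \vee b_n, H_n)$. Let $X_1, \dots, X_n$ be the free variables of $F$.
Let $I \subseteq \mathcal{B}$. $R$ is true in $I$ with answer constraint $\bar{b}, \bar{u}$ if there
exist $L_i(U_i) \leftarrow b_i, u_i \in I$ ($i = 1, \dots, m$), and
$c = (X_1 = t_1 \wedge \dots \wedge X_n = t_n)$ ($t_j \in \mathcal{H}$, $j = 1, \dots, n$), such that
$\bar{b} = \bigwedge_i b_i \wedge b \wedge c \wedge \bigwedge_i (Y_i = U_i)$,
$\bar{u} = u \wedge (u_1 \wedge \dots \wedge u_m)$, $\bar{b} \wedge \bar{u}$ is $\mathcal{H}$-solvable,
and $I \models \bar{b} \wedge F$.$^{17}$ □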

The bottom-up operator of an extended U-Datalog$^\neg$ program can now be defined
as follows (in the following, $body(r)$ denotes the body of a rule $r$).

Definition 9 

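Let $DB = IDB \cup EDB$ be an extended U-Datalog$^\neg$ database. The bottom-up
operator $T^e_{DB} : 2^{\mathcal{B}} \to 2^{\mathcal{B}}$ is defined as follows:

\[
\begin{array}{l}
T^e_{DB}(I) = \{\, p(X) \leftarrow \bar{b}, \bar{u} \mid \exists \text{ a renamed rule} \\
\quad r : p(X) \leftarrow b, u, L \diamond Q(b_1, H_1 \vee \dots \vee b_n, H_n) \in DB, \\
\quad \bar{b}, \bar{u} \text{ is an answer constraint for } body(r) \text{ in } I \,\}
\end{array}
\]
□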

Theorem 6 

Let $\mathcal{H}$ be a finite domain. $T^e_{DB}$ is a continuous operator. □
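
Continuity is what allows the semantics to be reached by iterating the operator from the empty interpretation. The following Python sketch shows that naive iteration on a deliberately simplified case: ground facts only, with no equality constraints, no update atoms and no quantified formula. The predicate `reach`, the facts in `EDB` and both function names are hypothetical and serve only the illustration; this is not the operator $T^e_{DB}$ itself.

```python
# Hypothetical extensional facts over a three-element domain.
EDB = {("emp_man", "a", "b"), ("emp_man", "b", "c")}


def t_operator(interp):
    """One bottom-up step for the (assumed) positive rules
         reach(X, Y) <- emp_man(X, Y)
         reach(X, Y) <- emp_man(X, Z), reach(Z, Y)
    applied to the set of ground facts `interp`."""
    derived = set(EDB)  # extensional facts hold at every step
    for (pred, x, y) in interp:
        if pred != "emp_man":
            continue
        derived.add(("reach", x, y))
        for (pred2, z, w) in interp:
            if pred2 == "reach" and z == y:
                derived.add(("reach", x, w))
    return derived


def least_fixpoint(operator):
    """Iterate a monotone operator from the empty interpretation until it
    stops changing; termination relies on the domain being finite."""
    current = frozenset()
    while True:
        following = frozenset(operator(current))
        if following == current:
            return current
        current = following


FIXPOINT = least_fixpoint(t_operator)
print(sorted(fact for fact in FIXPOINT if fact[0] == "reach"))
```

On this input the iteration stabilizes after deriving `reach(a, b)`, `reach(b, c)` and `reach(a, c)`.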

Given an extensional database EDB and an extended U-Datalog$^\neg$ program IDB,
the semantics of $DB = IDB \cup EDB$ is obtained as the least fixpoint of $T^e_{DB}$,
denoted by $FIX^e_{DB}$. Due to the presence of a first order formula in rule bodies, a
new safeness through query invocation property has to be stated.

Definition 10 

Let P be an extended U-Datalog$^\neg$ program. P is safe through query invocation if
each non-quantified variable appearing in a rule head, in an update atom, or in a
negated atom also appears in a positive literal of the rule body or is bound by a
constant present in the goal. □
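
Definition 10 amounts to a set-inclusion test per rule. The sketch below reads the definition literally and is only an illustration: the `Rule` record and its field names are hypothetical, rules are reduced to the sets of variables occurring in each position, and bindings propagated through equality constraints such as $X = Y$ are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """Variables of a rule, grouped by where they occur (hypothetical layout)."""
    head_vars: set
    positive_vars: set                                   # positive body literals
    update_vars: set = field(default_factory=set)        # update atoms
    negated_vars: set = field(default_factory=set)       # negated atoms
    quantified_vars: set = field(default_factory=set)    # bound by Q


def is_safe(rule, goal_bound=frozenset()):
    """Literal reading of Definition 10 for a single rule."""
    must_be_bound = (
        rule.head_vars | rule.update_vars | rule.negated_vars
    ) - rule.quantified_vars
    return must_be_bound <= (rule.positive_vars | set(goal_bound))


def is_program_safe(rules, goal_bound=frozenset()):
    """A program is safe when every one of its rules is."""
    return all(is_safe(r, goal_bound) for r in rules)


# ins_man(X) <- +dep_A(X), -dep_A(Y), emp_man(X, Y)   (from Example 5)
ins_man_rule = Rule(head_vars={"X"}, positive_vars={"X", "Y"},
                    update_vars={"X", "Y"})

# Hypothetical rule p(X) <- +q(W), r(X): W occurs only in an update atom.
unbound_rule = Rule(head_vars={"X"}, positive_vars={"X"}, update_vars={"W"})

print(is_safe(ins_man_rule))                    # True
print(is_safe(unbound_rule))                    # False
print(is_safe(unbound_rule, goal_bound={"W"}))  # True
```

The last call shows the role of the goal: a variable left unbound by the rule body becomes acceptable once the goal supplies a constant for it.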

The unfolding operator we are going to define works stratum by stratum. First,
positive literals in the rule bodies of each stratum are unfolded by using operator
$U^{pos}_{IDB}$. Then, the negative literals contained in the $i$-th stratum of $U^{pos}_{IDB}$ are
unfolded by using an operator $U^{neg}_{IDB}$ and the rules resulting from the complete
unfolding of strata $1, \dots, i-1$. The result is an extended U-Datalog$^\neg$ program equivalent to
IDB but not containing positive or negative intensional literals.

17 $I \models \bar{b} \wedge F$ means that for any assignment of values to quantified variables,
satisfying $\bar{b}$, it is possible to find some constrained literals in $I$ unifying with those
in $F$, such that the resulting constraint is $\mathcal{H}$-solvable.






Before presenting the unfolding operator, we define the operator $U^{neg}_{IDB}$ which
unfolds the negative literals of a U-Datalog$^\neg$ program IDB, by using the rules of
an extended U-Datalog$^\neg$ program $U$. In order to define function $U^{neg}_{IDB}$, we need an
operator, $Neg_c$, which takes the disjunction of a set of (extended) U-Datalog$^\neg$ rule
bodies defining a predicate $p$, performs its logical negation and returns the resulting
first order formula in prenex disjunctive normal form. This formula is then used to
construct the unfolded rule. Information on the head variables of each rule defining
$p$ is needed to identify the local variables, that is, the variables which
have to be quantified. In performing negation, the constraints which make the
update atoms satisfiable also have to be considered, similarly to what has been done in
the definition of operator Comp.

Definition 11 

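Let $P^e = \{IDB \mid IDB \text{ is an extended U-Datalog}^\neg \text{ program}\}$. Let IDB be a
stratified U-Datalog$^\neg$ intensional database. The unfolding operator
$U^{neg}_{IDB} : P^e \to P^e$ is defined as follows:

\[
\begin{array}{l}
U^{neg}_{IDB}(U) = \{\, p(X) \leftarrow b, u, A_1, \dots, A_n \diamond F \mid \exists \text{ a renamed rule} \\
\quad p(X) \leftarrow b, u, A_1, \dots, A_n, \neg p_1(Z_1), \dots, \neg p_m(Z_m) \in IDB, \\
\quad \text{for all } p_i\ (i = 1, \dots, m), \text{ consider the body of all the rules } r^i_j\ (j = 1, \dots, l_i) \text{ defining } p_i \text{ in } U: \\
\qquad r^i_1 : p_i(V_{i,1}) \leftarrow b_{i,1}, u_{i,1}, L_{i,1} \diamond F_{i,1} \in U \\
\qquad \vdots \\
\qquad r^i_{l_i} : p_i(V_{i,l_i}) \leftarrow b_{i,l_i}, u_{i,l_i}, L_{i,l_i} \diamond F_{i,l_i} \in U \\
\quad F\,^{18} = Q(c_1, L_1 \vee \dots \vee c_e, L_e) \in \\
\qquad Neg_c((Z_1 = V_{1,1} \wedge body(r^1_1) \vee \dots \vee Z_1 = V_{1,l_1} \wedge body(r^1_{l_1})) \wedge \dots \wedge \\
\qquad\quad (Z_m = V_{m,1} \wedge body(r^m_1) \vee \dots \vee Z_m = V_{m,l_m} \wedge body(r^m_{l_m}))), \\
\quad b \wedge u \wedge Q(c_1 \vee \dots \vee c_e) \text{ is } \mathcal{H}\text{-solvable} \,\}
\end{array}
\]
□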

By using operator $U^{neg}_{IDB}$, the compositional semantics is defined as follows.

Definition 12 

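Let IDB be a stratified U-Datalog$^\neg$ program. Suppose that IDB, and therefore
$U^{pos}_{IDB}$, admits a stratification with $k$ strata $P_1, \dots, P_k$. Let $U^{pos}(P_j)$ be the
set of rules contained in stratum $j$ of $U^{pos}_{IDB}$. The compositional semantics
$\mathcal{U}_{IDB}$ is defined as follows (see Subsection 2.2.1 for the definition of $ID_{EDB}$):

\[
\begin{array}{l}
\bar{P}_1 = U^{neg}_{U^{pos}(P_1)}(ID_{EDB}) \\
\bar{P}_2 = U^{neg}_{U^{pos}(P_2)}(ID_{EDB} \cup \bar{P}_1) \\
\quad \vdots \\
\bar{P}_i = U^{neg}_{U^{pos}(P_i)}(ID_{EDB} \cup \bar{P}_{i-1}) \\
\mathcal{U}_{IDB} = \bigcup_{1 \leq i \leq k} \bar{P}_i.
\end{array}
\]
□
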
Theorem 7 

$\mathcal{U}_{IDB}$ is safe through query invocation and, for each admissible goal $G$ and for each
extensional database EDB, the answer constraints for $G$ in $FIX^{\neg}_{IDB \cup EDB}$ are
the same as the ones in $FIX^{e}_{\mathcal{U}_{IDB} \cup EDB}$. □



18 Note that no updates are generated.
Example 7

Consider the program IDB and its stratification, as presented earlier, and its
positive unfolding $U^{pos}_{IDB}$, as presented in Example 5. The compositional semantics
of IDB is constructed as follows:

$\bar{P}_1 = P_1 \qquad \bar{P}_2 = P_2$

$\bar{P}_3 = change\_man(X) \leftarrow emp\_man(X, Y), dep\_B(X), dep\_A(Y)$

$\qquad change\_man(X) \leftarrow X = Y, +emp\_man(X, Y), dep\_B(X) \diamond \forall Z\,(X = Z \vee \neg emp\_man(X, Z))$

$\mathcal{U}_{IDB} = \bar{P}_1 \cup \bar{P}_2 \cup \bar{P}_3.$

The second rule for predicate $change\_man$ derives from the unfolding of
predicate $\neg ins\_man(X)$. The formula $\forall Z\,(X = Z \vee \neg emp\_man(X, Z))$ is the
simplified result of the application of the $Neg_c$ operator to the disjunction of the
bodies of the rules defining $ins\_man$. ◇

It is important to remark that the compositional semantics should not be regarded
as an alternative to the marking phase semantics. Indeed, the
computation of the compositional semantics can be quite expensive. However, since
the compositional semantics is recursion free and has to be computed just once
(unless the Herbrand domain changes), it can be meaningfully used in some cases
as a precompilation technique.



4 Concluding remarks 

In this paper we have introduced negation inside U-Datalog rules and proposed a
stratification-based approach to assign a semantics to such programs. We have also
introduced a weaker concept of compositionality and presented a finite and effectively
computable compositional semantics for U-Datalog$^\neg$ programs. By results
presented in (Bertino & Catania, 2000), it is straightforward to prove that, with
respect to the returned answers, U-Datalog$^\neg$ is equivalent to Stratified Datalog$^\neg$,
open with respect to a subset of extensional predicates (Bertino et al., 1999). The
presented results can be extended to deal with other Datalog-like update languages
safe through query invocation. Future work includes the introduction of negation
in other U-Datalog extensions (Bertino et al., 2000; Bertino et al., 1998a) and the
definition of static analysis techniques for U-Datalog$^\neg$, similar to those proposed
for U-Datalog (Bertino & Catania, 1996).